Numerical Simulation to Predict COVID-19 Cases in Punjab

Coronavirus disease 2019 is a novel disease caused by a newly identified virus, Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2). India recorded its first case of COVID-19 on 30 January 2020. This work is an attempt to calculate the number of COVID-19 cases in Punjab by solving a partial differential equation using the modified cubic B-spline function and differential quadrature method. The real data of COVID-19 cases and Google Community Mobility Reports of Punjab districts were used to verify the numerical simulation of the model. The Google mobility data reflect the changes in social behavior in real time and therefore are an important factor in analyzing the spread of COVID-19 and the corresponding precautionary measures. To investigate the cross-border transmission of COVID-19 between the 23 districts of Punjab with an analysis of human activities as a factor, the 23 districts were divided into five regions. This paper is aimed at demonstrating the predictive ability of the model.


Introduction
SARS-CoV-2 was recognized as a new disease known as coronavirus disease 2019 (COVID- 19) and was declared a pandemic by the WHO on March 11, 2020. In India, three cities in Kerala reported the first incident of COVID on January 30. The country was placed under total lockdown on March 25, 2020, which lasted until May 31, 2020. After that, the towns began to reopen gradually. Both infection rates and new and active patients began to decline in September. By January 2021, the number of daily cases had decreased to less than 15,000 from more than 90,000 in mid-September. The second wave, which began in March 2021, was far more devastating than the first. Regions of the country suffered shortages of vaccines, oxygen cylinders, hospital beds, and other medical supplies. By the end of April, India had surpassed the rest of the world in terms of new and active cases. On January 16, 2021, India began its vaccination program with the AstraZeneca vac-cine (Covishield) and the indigenous Covaxin. Sputnik V and the Moderna vaccine were later licensed for emergency use as well. According to the CoWIN portal, India reached 100 crore (1 billion) vaccination doses on October 21, 2021, at 9:47 am [1]. A spatiotemporal PDE model is used in this work to forecast COVID-19 cases in Punjab, India. The model utilized the combined impact of the transboundary spread throughout Punjab regions with the impact of social distance effects. Many spatiotemporal models, such as [2][3][4][5], use PDEs to define infectious illnesses. This work is an attempt to predict COVID-19 cases using real-world data collected from the Google Community Mobility Reports. As locations provide real-time information about changes of population in their social behavior, mobility trends derived from location history are crucial to study COVID-19 spread and prevention.
There are some limitations with Google mobility data due to the dynamic nature of mobility patterns. One of the major reasons is that cell phones may not accurately reflect population mobility, especially in suburban areas where people are not using GPS technology [6].
This manuscript is divided into ten sections. Section 1 briefly explains the problem. In Section 2, the data are explained using the PDE model in Section 3. Sections 4 and 4 briefly explain the differential quadrature method and the B-spline functions, with more details in Section 6. Sections 7 and 8 contain the prediction method and parameter estimation. Results are presented in Section 9, followed by concluding remarks in Section 10.

Description of the Data
In this work, the model is applied for the period July 1, 2020, to August 15, 2020. During this time, India was under unlock phases 2 and 3 when few places were reopened in a phased manner.
There are twenty-three districts in Punjab. Malerkotla is the 23 rd district of Punjab. The district was carved out of Sangrur district on 14 May 2021, but here, it has been taken as a part of Sangrur district as the data for Malerkotla district was not available separately. The twenty-two districts have been grouped into five regions as follows ( Figure 1): (i) Region 1. Pathankot, Gurdaspur, Amritsar, Tarn Taran, and Kapurthala.
By summarizing the new COVID-19 cases from all districts in a region, the daily new cases of that region are calculated. Two time series of Google mobility reports have been created. The first category includes activities such as grocery stores and pharmacies, retail and recreational activities, workplaces and parks, and transit stops that are thought to increase COVID-19 cases; the second category of activities focuses on residential activities (activities at home) that may help prevent COVID-19 epidemics. The daily changes of each region are calculated by aggregating the changes of all counties within a region.

The PDE Model
The considered PDE model can be categorized as an internal/ local process within each region and an external/global process. In the local process, people can become infected by social interactions with infected individuals within a region but can take steps to prevent COVID-19 from spreading, while in the global process, it is possible to become infected through social interactions with those who have been infected outside of a particular region. The considered five regions are embedded in a Euclidean space and mapped on the x-axis, with coordinates for the five regions [7]. The close arrangement of regions ensures a continuous model to study the spread of COVID-19 in the regions.
Let Cðx, tÞ denote the COVID-19 cases in the Punjab region x at a particular time t. The model is defined as follows:   [3,8] to describe the geographical spread of infectious diseases. In this case, dðxÞ is assumed to be constant, which means that dðxÞ ≡ d > 0 (ii) rðtÞlðxÞaðx, t − 14ÞCðx, tÞ denotes new cases in a region x at time t, which is implemented frequently to discuss the growth of germs, cancers, and information over time [8] (a) The function rðtÞ > 0 represents the growth rate in COVID-19 cases at time t for all Punjab regions. Here, rðtÞ is assumed to increase with time t as the number of COVID-19 cases increases. Among the choices for representing the pattern, rðtÞ = gðb 1 + b 2 tÞ and gðuÞ = 1/ð1 + exp ð−uÞÞ have been chosen with the parameters b 1 and b 2 , which are to be determined from COVID-19 The spatial diversity of COVID-19 is described by the location function lðxÞ, which exhibits various infection rates in the five Punjab regions. The function lðxÞ can be created by using cubic spline interpolation in MATLAB. The gathered COVID-19 data is used to determine lðxÞ (c) aðx, t − 14Þ represents data taken from Google mobility reports for retail and recreation, groceries and pharmacies, parks and transit stations, and workplaces, all of which are expected to increase the number of COVID-19 cases. Because the effect of the activities may take up to two weeks, the PDE model time t is shortened by 14 days to account for the COVID-19 incubation period (iii) The function cðhðx, t − 14ÞCðx, tÞ/ðk + Cðx, tÞÞÞ shows a reduced rate of COVID-19 cases due to high precaution measures to limit contact. The Michaelis-Menten function was used to limit the impact of household activities on the spread of COVID-19 (a) hðx, t − 14Þ describes the data of residential activities that help to prevent COVID-19 outbreaks, collected by Google mobility data, where t − 14 is the incubation period (b) Here, c denotes the maximum reduction rate for each region with the effect of government actions and personal precautions (c) k is the number of COVID-19 cases where the reduction rate is ð1/2Þc aðx, t − 14Þ and hðx, t − 14Þ are obtained using Google mobility data, while the other constants d, k, and c and parameters of rðtÞ and lðxÞ are evaluated by the data of COVID-19 cases Cðx, 1Þ = ψðxÞ is the initial function that describes the beginning states of COVID-19 in each region of Punjab and can be created by using cubic spline interpolation in MATLAB from the collected data of COVID-19 cases To solve the PDE model, the boundary conditions are considered of Neumann type [8] given as For simplicity, cases imported from neighboring states are counted as local Punjab cases, assuming that no COVID-19 spreads across the border when x = 1 and 5.
Many mathematical models have been used to discuss the mathematics intervention in the study of the spread of infectious diseases [9][10][11]. The classic models, namely, the susceptible, infectious, and recovered (SIR) model [12] and the susceptible, exposed, infectious, and recovered (SEIR) model [13], are the models generally used to study the spread and after-effects of COVID-19. Other proposed models are based on differential equations and statistical models [14,15]. The present PDE model adopted in the study is based on the community model [7]. This model includes the impacts of personal precautions such as wearing face masks and maintaining social distance on COVID-19 cases at the district level. To the best of our knowledge, this work is the first attempt to apply PDE models on COVID-19 prediction with the Google Community Mobility Reports in Punjab, India.

Differential Quadrature Method
The differential quadrature method is a technique for approximating the derivatives of a function as a linear summation of function at the discrete knot points in the domain [16]. It was developed by the late Richard Bellman and his associates in the early 1970s. The following summation is used to approximate the derivatives of function uðxÞ at any nodal point x i of the partition. Here, a ij are constant coefficients known as weighting coefficients which can be obtained in many ways. The present work is intended to implement the B-spline basis functions of the third degree with modification in DQM to determine the weighting coefficients.
The idea of using the spline-based differential quadrature method to determine weighting coefficients was proposed by Bellman et al. [16], which was researched and improved by Quan and Chang [17,18]. DQM has been implemented with various basis functions including the sinc functions, spline functions, and Lagrange interpolation polynomials [17,18]. A lot of work has been reported in the 3 Computational and Mathematical Methods in Medicine literature for the advancement in the methodology for the selection of the grid points [19][20][21]. The Cubic B-spline has also shown accurate results for solving various PDEs using the DQM approach [22][23][24][25][26]. Third-order B-splines have advantages in terms of computational simplicity, numerical stability, and interpolated curve smoothness. Since both DQM and cubic B-spline functions have proven to be very effective, we chose to use this approach.

B-Spline
A spline of order n is a piecewise polynomial function of degree n − 1 in a variable x.
The values of x where the pieces of the polynomial meet are known as knots, denoted as x 0 , x 1 , x 2 , ⋯, x n , and sorted into nondecreasing order.
The third-degree B-spline called as the cubic B-spline basis function is given by the formula defined below with h as the knot span with the uniformly distributed knots, i.e., From the definition, the values of B i ðxÞ and its first and second derivatives at the nodal points can be tabulated.
The cubic B-spline basis functions are further implemented in the modified form to make the system diagonally dominant, defined as follows: Therefore, fM 1 , M 2 , ⋯, M m g now form a basis over the region.

Description of the Method
The first-order derivative approximation is given by Now, for the first knot point x 1 , that is, i = 1, a 1j M p x j À Á , for p = 1, 2, ⋯, m: ð7Þ h 2 = 6a 11 + 1a 12 + 0a 13 + 0a 14 The "Thomas algorithm" is used to solve the above tridiagonal system of equations. The solution of which provides the coefficients a 11 , a 12 , ⋯, a 1m .

Computational and Mathematical Methods in Medicine
Following the same process from i = 2 to i = m − 1, the value of coefficients a i1 , a i2 , ⋯, a im is calculated. Hence, the weighing coefficients a ij for i = 1, 2, ⋯, m and j = 1, 2, ⋯, m have been evaluated. Using these coefficients, the weighing coefficients b ij for i = 1, 2, ⋯, m and j = 1, 2, ⋯, m are calculated by

Prediction Method
The implementation of the proposed scheme (DQM and modified cubic B-spline) turns the function derivative ð∂/∂ xÞðdðxÞð∂Cðx, tÞ/∂xÞÞ into a linear sum of function values at discrete points, which is further solved by the R-K approach using MATLAB. The values of the model parameters are to be determined by the collected data of COVID-19 cases in the five regions. In this work, the COVID-19 cases have been predicted one day ahead.
In order to predict COVID-19 cases, the parameters have first been trained and then the PDE is solved for prediction. The following algorithm explains the prediction procedure [7]: (i) The data of the number of COVID-19 cases has been collected for the period July 1, 2020, to August 15, 2020. So there are a total of 46 days in this period (ii) The historical COVID-19 data for days 1-7 has been used to train the parameters for each region (iii) The fourth-order Runge-Kutta method is used in MATLAB to numerically solve the equation for prediction for day 8  Computational and Mathematical Methods in Medicine (iv) The collected data for days 2-8 is used to train the parameters and predict for day 9. This process is repeated till day 46 is predicted from the data collected for days 39-45 (v) The actual COVID-19 data for days 8-46 are then compared to the predicted data to check the accuracy

Estimated Parameters
The following are the values of the parameters used for the prediction of COVID-19 cases on the 8 th , 9 th , and 10 th days: (i) lðxÞ represents the different infection rates of the five regions of Punjab. The daily infection rate is 7 days moving average of new cases per 100,000 residents (ii) As rðtÞ is the growth rate for all Punjab regions, therefore, b 1 and b 2 have only one value for each prediction step. Table 1 shows the value of l i for regions 1-5, respectively, and the constants b 1 and b 2 (iii) aðx, t − 14Þ is a collection of data for retail and recreation, groceries and pharmacies, parks, transit stations, and workplaces from Google mobility data. The daily changes for each region were calculated by aggregating the changes of all counties within a region. The negative values of aðx, t − 14Þ reflect the decrease in mobility during this period compared to the baseline period (iv) The hðx, t − 14Þ entry represents the daily changes in residential activities of each region, which were calculated by aggregating the changes of all counties within a region. The positive values of hðx, t − 14Þ  Table 2 represents the value of aðx, t − 14Þ, hðx, t − 14Þ, and d > 0 for the five regions (vi) The calculated values of c and k are shown in Table 3 9. Prediction Results and Accuracy , we can see that for region 1 and region 2, the curve for the predicted and actual values intersects at many points, implying that the predicted value for that day is exactly the same as the actual value. For region 3, the curves do not intersect for more than one point and the predicted data has smaller values compared to the actual data, but both curves follow a similar pattern. For regions 4 and 5, the curves meet at few points and the predicted data has higher values than the actual data for the rest of the points. We can observe that for most of the points, the difference between both data is not huge and can be improved. To analyze the model accuracy of prediction, the following formula has been used [7]: where x real represents the actual COVID-19 cases and x predict represents the predicted COVID-19 cases. The accuracy for days 8, 9, and 10 for each region of Punjab is shown in Table 4. The median of accuracy for regions 1-5 from day 8 to day 46 is 63.79%, 59.45%, 46.22%, 14.55%, and -10.91%, respectively. The accuracy achieved is relatively low, which  Computational and Mathematical Methods in Medicine could be due to certain limitations of this model as it does not include recovered cases. A person who has recovered from COVID-19 has low chances of being infected again, which affects the rate of infection. Google mobility reports also have certain limitations. As Punjab has many rural areas where GPS-enabled smartphones are not widely used, this affects the reliability of Google mobility data.

Conclusion
This work is an attempt to analyze the effectiveness of a partial differential equation model to predict the number of COVID-19 cases in Punjab, India. For this, 23 districts of Punjab were distributed into 5 groups/regions. The realtime data of COVID-19 as well as the Google mobility reports have been used for the calculation. The differential quadrature method and modified cubic B-spline functions have been used to convert the PDE into an ordinary differential equation, which has been further solved using the fourth-order Runge-Kutta method. All the numerical simulations have been performed using MATLAB. The actual COVID-19 data and the predicted data for regions 1-5 have been compared. From Figures 2-6, it can be seen that for many points, the model gives predicted values exactly the same as the actual data. The accuracy of the model for regions 1-5 from day 8 to day 46 is 63.79%, 59.45%, 46.22%, 14.55%, and -10.91%, respectively. The PDE model and the techniques used are found to be effective. A lot of work done has been proposed by researchers to analyze the spread of COVID-19 using an ordinary differential equation model, but a PDE model is rare. Keeping certain limitations of this model into consideration, this work shows the  Abbreviations PDE: Partial differential equation GPS: Global Positioning System DQM: Differential quadrature method.

Data Availability
The data used for the analysis of the model is available on http://www.google.com/covid19/mobility/. COVID-19 data for daily new cases can be found at https://www .covid19india.org/state/PB and https://www.mohfw.gov.in/.